Gender Equity in College Sports

Massive Data Fundamentals final project

Authors

Bridgette Sullivan, Jasmine Jia, Katharyn Loweth, Maria Bartlett

Policy context

Equity in Athletics Disclosure Act (EADA)

Data

Research questions

Data pipeline

Analyses

Exploratory

```{r}
#| eval: false
head(mtcars)
```

Unsupervised

Institution Level

Using Principal Component Analysis (PCA) and K-means clustering, practitioners can identify universities that resemble one another in sports equity. The five clusters shown below appear to separate largely by school size. On the far right of the graph are the large universities, with higher spending and larger teams. These universities skew the graph somewhat, because they spend so much more on their sports teams than the many small schools concentrated on the left side.
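As a rough illustration of this kind of workflow, the sketch below runs PCA followed by K-means on simulated data. The data frame, its column names, and the choice to cluster on the first two principal components are all assumptions made for the example, not the project's actual EADA variables or pipeline; only the number of clusters (five) comes from the text above.

```{r}
#| eval: false
# Hypothetical institution-level features. Column names are illustrative
# and the values are simulated, not taken from the EADA data.
set.seed(42)
n <- 200
size <- rlnorm(n, meanlog = 0, sdlog = 1)  # latent "school size" factor

schools <- data.frame(
  total_expenses   = size * 1e6 * runif(n, 0.8, 1.2),
  total_athletes   = round(size * 150 * runif(n, 0.8, 1.2)),
  women_share_exp  = runif(n, 0.30, 0.60),
  women_share_part = runif(n, 0.35, 0.60)
)

# Center and scale so dollar-valued columns do not dominate the components.
pca <- prcomp(schools, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]  # keep the first two components (an assumption)

# K-means with five clusters; several random starts reduce the chance of
# settling on a poor local optimum.
km <- kmeans(scores, centers = 5, nstart = 25, iter.max = 50)
schools$cluster <- factor(km$cluster)

# Cluster sizes, plus a quick scatter plot of the two components.
table(schools$cluster)
plot(scores, col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2", main = "Simulated schools by cluster")
```

Because the simulated spending and team-size columns are right-skewed, a handful of large schools end up far from the rest along the first component, which mirrors the skew described above. Log-transforming those columns before PCA is one possible way to reduce that effect.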



The clustering also supports another use case: a school administrator may want to find the universities most similar to their own. The next two examples show what administrators at Georgetown or Furman might see when they enter their own university and look up its nearest neighbors. Some of the results are expected, but others are less so, which gives schools a reason to connect about gender equity in sports and to learn from one another's programs. For example, Georgetown sits close to many schools in its conference, but schools such as East Carolina and Old Dominion are also nearby and would be less obvious schools to connect with.



For Furman, the nearest schools are much less intuitive: most are on the West Coast, far from the South Carolina school. These connections could help facilitate discussions on how gender equity can be improved across regions.
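A nearest-neighbor lookup of this kind could be sketched as follows. The source does not state which distance measure or feature space was used, so the example assumes Euclidean distance on standardized features; the function `nearest_schools()`, the school names, and the feature values are all hypothetical.

```{r}
#| eval: false
# Hypothetical feature matrix: one row per school, simulated values.
set.seed(1)
features <- matrix(
  rnorm(8 * 3), nrow = 8,
  dimnames = list(paste0("School_", LETTERS[1:8]),
                  c("expenses", "athletes", "women_share"))
)

# Return the k schools closest to `school`, with their distances.
# Assumes Euclidean distance on standardized columns.
nearest_schools <- function(x, school, k = 3) {
  stopifnot(school %in% rownames(x))
  z <- scale(x)                      # standardize each column
  d <- as.matrix(dist(z))[school, ]  # distances from `school` to every school
  d <- d[names(d) != school]         # drop the school itself
  head(sort(d), k)                   # k smallest distances, ascending
}

nearest_schools(features, "School_A", k = 3)
```

The result is a named vector of distances, so an administrator would see both which schools are closest and how close they are.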

Sport Level

Project pitch